POPULAR VALUES OF THE LARGEST PRIME DIVISOR FUNCTION


NATHAN MCNEW 


Abstract. We consider the distribution of the largest prime divisor of the integers in the interval 
[2, x], and investigate in particular the mode of this distribution, the prime number(s) which show 
up most often in this list. In addition to giving an asymptotic formula for this mode as x tends to 
infinity, we look at the set of those prime numbers which, for some value of x, occur most frequently 
as the largest prime divisor of the integers in the interval [2, x]. We find that many prime numbers
never have this property. We compare the set of “popular primes,” those primes which are at some 
point the mode, to other interesting subsets of the prime numbers. Finally, we apply the techniques 
developed to a similar problem which arises in the analysis of factoring algorithms. 


1. Introduction 


Let P(n) denote the largest prime divisor of an integer n ≥ 2. The distribution of the values of
this function as n ranges over the interval [2, x] has been considered by several authors. Alladi and
Erdős [2] investigated the average order of P(n) (as well as the average order of the k-th largest
prime factor) and showed that

\[ \sum_{n \le x} P(n) = \frac{\pi^2}{12} \frac{x^2}{\log x} + O\left(\frac{x^2}{\log^2 x}\right). \tag{1} \]
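As a numerical illustration of (1), the following minimal sketch tabulates P(n) with a sieve and compares the sum of P(n) over n ≤ x with the main term. The function names are hypothetical, and the ratio approaches 1 only slowly, since the relative error in (1) is of order 1/log x.

```python
import math

def largest_prime_factor_table(limit):
    """Return a list lpf with lpf[n] = P(n) for 2 <= n <= limit (0 below 2)."""
    lpf = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if lpf[p] == 0:  # no smaller prime divides p, so p is prime
            for m in range(p, limit + 1, p):
                lpf[m] = p  # primes are visited in increasing order, so the last write is P(m)
    return lpf

def mean_value_ratio(x):
    """Ratio of sum_{n <= x} P(n) to the main term (pi^2 / 12) x^2 / log x."""
    lpf = largest_prime_factor_table(x)
    main_term = (math.pi ** 2 / 12) * x * x / math.log(x)
    return sum(lpf[2:]) / main_term

if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        print(x, round(mean_value_ratio(x), 4))
```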


This fact was later shown by Kemeny m using different methods, and improved upon by De
Koninck and Ivić, who showed that there exist constants d_1, d_2, ... such that for any m ≥ 1,

uniformly in m. Naslund m worked out the values of the constants in this expression, in particular

The median value, M(x), of P(n) as n ranges over the integers in [2, x] was considered by Selfridge

and Wunderlich m who noted that $M(x) = x^{1/\sqrt{e} + o(1)}$. The result itself is much older, however,
and was essentially Vinogradov’s trick for extending the usefulness of character sums. Naslund m
shows that this median value is given more accurately by

where the c_i are computable constants.

Note that the median value grows substantially slower than the mean value, which indicates 
that the distribution is skewed strongly to the right. De Koninck [3j shows that a mode of this 
distribution (note that the mode need not necessarily be unique), corresponding to a prime number 
which occurs with maximal frequency as the largest prime divisor of the integers in [2, x], grows
even slower still, slower than any power of x. More precisely, he shows the mode is given by

\[ \exp\left(\sqrt{\tfrac{1}{2} \log x \left(\log\log x + \log\log\log x + O(1)\right)}\right) \tag{5} \]

though in his result the O(1) term is incorrectly given as being o(1). In what follows, we will say
that a prime p is popular on the interval [2, x] if no prime occurs more frequently than p as the
largest prime divisor of the integers in that interval. While the asymptotic behaviors of the mean
and median values of this distribution are well understood, the relative error
term in (5) is quite large. The primary goal of this paper is to improve (5) and in particular give
the following asymptotic formula, which we prove in Section 4.


Theorem 1.1. If the prime p is popular on the interval [2,x] (i.e., p is a mode of the distribution 
of the largest prime divisor function for that interval) then p satisfies 


p = exp 


|v^i/(x) logx + ^ - 


^{x) — 3 
where $\nu(x)$ is the solution to the implicitly defined equation $e^{\nu(x)} = 1 + \sqrt{\nu(x) \log x} - \nu(x)$ and is
given approximately by

\[ \nu(x) = \tfrac{1}{2} \log\log x + \tfrac{1}{2} \log\log\log x - \tfrac{1}{2} \log 2 + o(1) \]

as $x \to \infty$.


Using this we also give an asymptotic expression for the frequency with which the mode value 
occurs, improving the approximation given in [U Theorem 1]. 


Theorem 1.2. If p is popular on the interval [2, x], then the number of integers n ∈ [2, x] for which
P(n) = p is given asymptotically by

In [5] De Koninck and Sweeney consider further the frequency with which prime numbers occur
as the largest prime divisor on the interval [2, x]. They note that for a fixed value of x there exists
an initial interval [2, f(x)] of primes p on which the frequency with which p = P(n) monotonically
increases at each prime, an intermediate range (f(x), g(x)) where the behavior is oscillatory, and a
final interval [g(x), x] on which it monotonically decreases. They show that for sufficiently large x,
$f(x) < \sqrt{\log x}$ and $g(x) > \sqrt{x}$. Clearly the mode value lies somewhere in the intermediate interval.
The oscillatory behavior and the exact value of the mode depend on the spacing and gaps between
the primes near this peak value.

Somewhat surprisingly one finds that there are primes which are not popular on any interval 
[2,x], and experimentally it appears that in fact most primes are not. We therefore define a prime 
to be a popular prime if it is popular on an interval [2, x] for some value of x. In Section [5] 
we investigate further this subset of the primes. Clearly there must be infinitely many popular 
primes. We are able to show that there is also a positive proportion of prime numbers which are 
not popular. To do this we show that the average prime spacing between popular primes cannot 
be too small. We prove a more general result which implies the following bound on their spacing. 


Theorem 1.3. Given any two sufficiently large consecutive primes, p < q, if the gap between them, 
q — p, is less than 0.153 log p, then p is not a popular prime. 

We then combine this with a consequence of the GPY sieve [6] which shows that a positive
proportion of prime gaps are smaller than that.

Corollary 1.4. A positive proportion of primes are not popular.


In Section [6] we present data on the prime numbers which, for some value of x < appear 

most frequently as the largest prime divisors of the integers in [2,x]. We compare these values 
to other subsets of the prime numbers, in particular the “convex primes,” the set of those prime
numbers, $p_n$, which form the vertices of the boundary of the convex hull of the points
$(n, p_n)$ in the plane, considered by Pomerance [13] and recently by Tutaj [19]. Within the range of
our computations the convex primes are a subset of the popular primes. 

Finally we apply the methods developed in this paper to another problem which turns out to 
be closely related to ours, the analysis of the running time of factoring algorithms. A key step in 
several algorithms for factoring integers (including Dixon’s random squares algorithm, the quadratic 
sieve and the number field sieve) requires generating a pseudorandom sequence of integers $a_1, a_2, \ldots$
until a subset of the $a_i$'s has product equal to a square. Pomerance [H] notes that in the (usually
heuristic) analysis of these algorithms one can assume that the pseudo-random sequence $a_1, a_2, \ldots$
is close enough to random that one can make predictions using this assumption, and thus the 
analysis of this step of these algorithms can be captured by the following question. 


Pomerance’s Problem. Select positive integers $a_1, a_2, \ldots \le x$ independently at random (each
integer is chosen with probability 1/x) until some subsequence of the $a_i$’s has product equal to a
square. When this occurs, we say that the sequence has a square dependence. What is the expected 
stopping time of this process? 
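The process in Pomerance's Problem can be simulated directly for small x. The sketch below (all names hypothetical) records each drawn integer as its vector of prime exponents modulo 2, stored as a bitmask, and stops as soon as the new vector is linearly dependent over GF(2) on the earlier ones, which is equivalent to some subsequence having square product. It illustrates the question only and is not the analysis carried out in this paper.

```python
import random

def primes_up_to(x):
    """All primes <= x by trial division (adequate for small x)."""
    return [p for p in range(2, x + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def parity_vector(n, primes):
    """Bitmask whose i-th bit is the exponent of primes[i] in n, reduced mod 2."""
    mask = 0
    for i, p in enumerate(primes):
        while n % p == 0:
            n //= p
            mask ^= 1 << i
    return mask

def stopping_time(x, rng):
    """Number of uniform draws from [1, x] until a square dependence appears."""
    primes = primes_up_to(x)
    basis = {}  # leading bit -> reduced vector having that leading bit
    draws = 0
    while True:
        draws += 1
        v = parity_vector(rng.randint(1, x), primes)
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v  # v is independent of the earlier draws
                break
            v ^= basis[lead]  # eliminate the leading bit
        if v == 0:  # reduced to zero: some subsequence multiplies to a square
            return draws

if __name__ == "__main__":
    rng = random.Random(0)
    samples = [stopping_time(1000, rng) for _ in range(200)]
    print(sum(samples) / len(samples))
```

Since the vectors live in a space of dimension π(x), the stopping time never exceeds π(x) + 1; the interesting question is how much earlier it typically occurs.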


Pomerance m showed for any e > 0 that as x 
in the interval 
exp

tends to 1. Croot, Granville, Pemantle and Tetali [3] showed that the interval can be taken to

with the same result, where h{x) is the maximum value of the 
consider They give only the same crude approximation

as Pomerance however. In Section [7] we analyze the values of y which maximize both and

, and give the following asymptotic for the function h{x). 


Theorem 1.5. For a given value of x, the value of h{x), the maximum value of for y < x 

is given asymptotically by 


h{x) = 


yJ2TT log X 
the same expression as (6).


2. Smooth Numbers 

These results rely on careful estimates for the counts of smooth numbers, those integers whose
prime factors are all less than some bound. In particular a number is called y-smooth if all of its
prime factors are at most y. We will denote by $\Psi(x, y)$ the count of the y-smooth numbers up to
x. We are specifically interested in the count of the number of integers up to x whose largest prime
factor is the prime p. This count is given by $\Psi(x/p, p)$ since each integer up to x whose largest
prime divisor is p can be written uniquely as p times a p-smooth number that is at most x/p.
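The identity between the two counts can be checked by brute force for small x. In the sketch below the function names are hypothetical and Ψ is computed naively by trial division, which is only practical for small arguments.

```python
def largest_prime_factor(n):
    """Largest prime divisor of n >= 2, by trial division."""
    largest, d = 1, 2
    while d * d <= n:
        while n % d == 0:
            largest, n = d, n // d
        d += 1
    return n if n > 1 else largest  # a leftover n > 1 is the largest prime factor

def psi(x, y):
    """Psi(x, y): number of integers 1 <= n <= x all of whose prime factors are <= y."""
    return sum(1 for n in range(1, int(x) + 1)
               if n == 1 or largest_prime_factor(n) <= y)

def count_with_largest_prime(x, p):
    """Number of integers n in [2, x] with P(n) = p."""
    return sum(1 for n in range(2, x + 1) if largest_prime_factor(n) == p)

if __name__ == "__main__":
    x = 1000
    for p in (2, 3, 5, 7, 11, 13):
        print(p, count_with_largest_prime(x, p), psi(x // p, p))
```

Note that Ψ(x/p, p) depends only on the integer part of x/p, which is why the sketch passes `x // p`.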

The function $\Psi(x, y)$ has been well studied over the course of the last century. From Hildebrand
[9] we know that for each $\varepsilon > 0$, $x \ge 2$ and $\exp\left((\log\log x)^{5/3+\varepsilon}\right) \le y \le x$,

\[ \Psi(x,y) = x\rho(u)\left(1 + O_\varepsilon\left(\frac{\log(u+1)}{\log y}\right)\right) \tag{7} \]

where

\[ u = \frac{\log x}{\log y} \]

and $\rho(u)$, the Dickman rho function, is the continuous solution to the differential delay equation

\[ u\rho'(u) + \rho(u - 1) = 0 \tag{8} \]

with the initial condition $\rho(u) = 1$ for $0 \le u \le 1$. It was shown by Alladi [1] that as $u \to \infty$,

rCW — 1 
Here $\gamma$ is the Euler-Mascheroni constant and $\xi(u)$ denotes the unique positive solution to the
equation

\[ e^{\xi(u)} = 1 + u\xi(u) \tag{10} \]

which is given approximately by

\[ \xi(u) = \log u + \log\log u + O\left(\frac{\log\log u}{\log u}\right). \]
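Both functions are easy to evaluate numerically. The following minimal sketch integrates the delay equation (8) on a uniform grid with the trapezoid rule and solves (10) by bisection; the function names, grid size and tolerance are illustrative choices, not anything taken from the references.

```python
import math

def dickman_rho_table(u_max, steps_per_unit=1000):
    """Values rho(i * h), h = 1 / steps_per_unit, for 0 <= i * h <= u_max."""
    h = 1.0 / steps_per_unit
    n = int(round(u_max * steps_per_unit))
    rho = [1.0] * (n + 1)  # rho(u) = 1 on [0, 1]
    for i in range(steps_per_unit + 1, n + 1):
        # rho'(t) = -rho(t - 1) / t, integrated over one grid step by the trapezoid rule
        f_prev = rho[i - 1 - steps_per_unit] / ((i - 1) * h)
        f_curr = rho[i - steps_per_unit] / (i * h)
        rho[i] = rho[i - 1] - 0.5 * h * (f_prev + f_curr)
    return rho

def dickman_rho(u, steps_per_unit=1000):
    """Approximate rho(u) for u >= 0 (u is rounded to the grid)."""
    table = dickman_rho_table(max(u, 1.0), steps_per_unit)
    return table[int(round(u * steps_per_unit))]

def xi(u, tol=1e-12):
    """Unique positive solution of exp(t) = 1 + u * t, for u > 1, by bisection."""
    f = lambda t: math.expm1(t) - u * t
    lo, hi = math.log(u), 2.0 * math.log(u) + 2.0  # f(lo) < 0 < f(hi) when u > 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(dickman_rho(2.0), 1 - math.log(2))  # rho(u) = 1 - log u on [1, 2]
    print(xi(10.0), math.log(10) + math.log(math.log(10)))
```

For moderate u the two-term approximation to ξ(u) is still rough, as the last printed pair shows.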
It will be useful later to note that 

Saias [16] gives an approximation for $\Psi(x, y)$ which, while better than Hildebrand’s result, is
somewhat more cumbersome to work with. Defining

\[ \Lambda(x,y) = x \int_0^\infty \rho\left(\frac{\log x - \log t}{\log y}\right) d\left(\frac{[t]}{t}\right) \]

then the approximation


\[ \Psi(x,y) = \Lambda(x,y)\left(1 + O_\varepsilon\left(\exp\left(-(\log y)^{3/5-\varepsilon}\right)\right)\right) \]

holds in the same range as Hildebrand’s result. Assuming the Riemann Hypothesis, this can be
improved to 

log X 


^'(x,y) = A(x,y) 1 + 

Saias also shows that the asymptotic expansion 
where the $a_j$ are the coefficients for the Taylor series of $(s - 1)\zeta(s)/s$ around s = 1, holds uniformly
for $x \ge 2$, $(\log x)^{1+\varepsilon} \le y \le x$ as long as

u- 3 ^ log log y 

k + l- j ~ logy 

for $0 \le j \le \min(k, u)$. We will use extensively Saias’ expansion in the case k = 1. In particular,
the constants $a_0$ and $a_1$ are given by $a_0 = 1$ and $a_1 = \gamma - 1$ so that if we define

\[ \kappa(x, y) = \rho(u) + (\gamma - 1) \frac{\rho'(u)}{\log y} \]

then the approximation 

T(x,y) = XK{x,y) ^1 + 0, ^^ ) )) 


holds in the same range as ©• 

In order to make use of Saias’ improved approximation we will also require a better
approximation of $\rho(u)$. Both Smida [17] and Xuan m have given improved approximations in which the
$(1 + O(1/u))$ is replaced by a series involving negative powers of u and $\xi(u)$. Xuan shows that for
any fixed integer N,


p{u) 



7 - u^{u) + 
T + On



where the bij are constants and the series is uniformly convergent. We will only be using his result 
in the case that N = 1. Smida’s work, which is done in greater generality for a family of differential 
difference equations like Dickman’s function, shows that 617 = 

Finally, Hildebrand O Theorem 3] gives an upper bound for the number of smooth integers in
short intervals which we will find useful. Uniformly for x ≥ y ≥ 2, 1 ≤ z ≤ x we have


T (x -h z, y) - ^'(x,y) < 



T(x,y)ylog(xy/z) 

T(xy/2,y)logy 

3. Dickman’s Function

The approximation 

h^ = <w(l + 0(i)) (17) 

is common in the literature. (See for example [18, Section III.5, Corollary 8.3].) We will need a
slightly stronger form obtained using the work of Smida and Xuan.

Lemma 3.1. For u > 1 and any u <C 1 the function p{u) satisfies 

Proof. By implicit differentiation of the functional equation $e^{\xi(u)} = 1 + u\xi(u)$ we find that

Now, using equation (15) with N = 1, along with (12) and the approximation

Here we have used the Taylor expansions for $\sqrt{1 + x}$ and $e^x$ around x = 0. Finally, using equations
m and ([2^ we have

We can use Lemma 3.1 to obtain a good approximation for the derivative of $\rho(u)$.

Lemma 3.2. For u> 1 we have 

p'iu) = -p{u) (i{u) + ^ (l + (^(^^^1)2 ) + O (^) ) • (23) 

Proof. Using the differential difference equation for $\rho(u)$ and Lemma 3.1 we have

p'{u) = -lp{u - 1) = -p(u) ({(„) + L (l + + o (^) ) • 

□ 


4. The most popular largest prime divisor 


For x ≥ 2 we say that a prime p is popular on the interval [2, x] if no prime occurs more frequently
than p as the largest prime divisor of the integers in that interval. In the case of a tie we will say that
any prime which occurs a maximal number of times is popular. The following theorem, Theorem
1.1 in the introduction, makes use of Saias’ approximation (14). In particular, this result implies
that for each e>0, x>4, p>2 and exp ((log log < p < ^, 

2 \ \ 

Theorem 4.1. If the prime p is popular on the interval [2, x] then p satisfies


p = exp 


\/u(x) logx + 
where $\nu(x)$ is the solution to the implicitly defined equation $\nu(x) = \xi\left(\sqrt{\log x / \nu(x)} - 1\right)$, given
approximately by

\[ \nu(x) = \tfrac{1}{2} \log\log x + \tfrac{1}{2} \log\log\log x - \tfrac{1}{2} \log 2 + o(1) \tag{26} \]

as $x \to \infty$.


Proof. By using the functional equation (10) for $\xi(u)$, we can rewrite the equation for $\nu(x)$ as the
solution to

\[ e^{\nu(x)} = 1 + \sqrt{\nu(x) \log x} - \nu(x) \tag{27} \]

which can be approximated using standard asymptotic techniques to yield the rough approximation
above.
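To see how good the rough approximation is, equation (27) can be solved numerically. The sketch below uses bisection in terms of L = log x; the function names are hypothetical, and `leading_order_mode` keeps only the terms $\sqrt{\nu(x)\log x} + 1/4$ of the exponent in Theorem 4.1, dropping all lower-order corrections, so it should be read as an order-of-magnitude guide rather than a prediction.

```python
import math

def nu(log_x, tol=1e-12):
    """Positive solution v of exp(v) = 1 + sqrt(v * log_x) - v, assuming log_x > 1."""
    g = lambda v: math.expm1(v) + v - math.sqrt(v * log_x)
    lo, hi = 1e-12, 1.0 + math.log(log_x)  # g(lo) < 0 < g(hi) when log_x > 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nu_approx(log_x):
    """The approximation (1/2)(log log x + log log log x - log 2)."""
    return 0.5 * (math.log(log_x) + math.log(math.log(log_x)) - math.log(2))

def leading_order_mode(log_x):
    """exp(sqrt(nu * log x) + 1/4): the leading terms only, lower-order terms dropped."""
    v = nu(log_x)
    return math.exp(math.sqrt(v * log_x) + 0.25)

if __name__ == "__main__":
    for digits in (14, 50, 100, 1000):
        L = digits * math.log(10)
        print(digits, nu(L), nu_approx(L), leading_order_mode(L))
```

For x = 10^14 the leading-order value is a few thousand, the same order of magnitude as the largest popular primes found in the computations of Section 6.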

The proof proceeds in three steps, each giving better bounds for any prime that is popular on
the interval [2, x]. We show first that as $x \to \infty$, if p is popular on [2, x], then p satisfies

\[ p = \exp\left(\sqrt{\nu(x) \log x} + O(\log\log x)\right), \]
and finally that the approximation (25) holds.

To see that $\Psi(x/p, p)$ is maximized near (28), we first set

If p' is the greatest prime less than or equal to $P_0$ then 'I' ^ hence if

(f’^) some prime q, then q is not popular on [2,x]. 

Note that by definition $\nu(x) = \xi(u_0)$. We then compute, using (7) as well as (9) that

for sufficiently large x. Using the elementary estimate $\Psi(x, y) \ll x \exp\left(-\frac{\log x}{2 \log y}\right)$, $x \ge y \ge 2$,
(see [18, Section III.5, Theorem 1]) we see that for any $\varepsilon > 0$ and sufficiently large x that if

which is asymptotically less than (30). Similarly, if $y > \exp\left((2 + \varepsilon)\sqrt{\nu(x) \log x}\right)$, then trivially

\[ \Psi\left(\frac{x}{y}, y\right) \le \frac{x}{y} < x \exp\left(-(2 + \varepsilon)\sqrt{\nu(x) \log x}\right), \tag{31} \]

which proves (28). We can thus assume without loss of generality in the following that a prime
popular on [2, x] must lie in the range where Hildebrand’s approximation (7) holds, which we now
use along with (9) and (12) to prove (29).

Suppose q is a prime lying in the interval (28), also satisfying

\[ \left|\log P_0 - \log q\right| > 2\nu(x). \tag{32} \]

We will show that for sufficiently large x, $\Psi(x/P_0, P_0) > \Psi(x/q, q)$, which means that some other
prime occurs more frequently than q as the largest prime divisor on the interval [2, x], which will
then imply (29) because $\nu(x) = O(\log\log x)$.

Letting $u_q = \frac{\log x}{\log q} - 1$ and, as before, $u_0 = \frac{\log x}{\log P_0} - 1 = \sqrt{\frac{\log x}{\nu(x)}} - 1$, we have, using

First, if $q < P_0$ then $u_0 < u_q$ and so, using (|2n]l we know that > 1, at least for sufficiently
large x, and that $\int_{u_0}^{u_q} \xi(t)\,dt > (u_q - u_0)\xi(u_0)$. Using these inequalities we see that the main term
in (33) is greater than

' log X log X

Differentiating this with respect to log q gives 1 — which is negative for all $q < P_0$, and so, as

This is not only greater than 1, but also asymptotically greater than the error term of (33), and
so we can conclude that the ratio there is strictly greater than 1, for sufficiently large x. Therefore
some other prime occurs more frequently than q as the largest prime divisor on [2, x].

If instead, $q > P_0$, we have $u_q < u_0$ which means < 1, and so a little more care is required.

Let $\delta = u_0 - u_q$. Because $\log q - \log P_0 > 2\nu(x)$, we will have that

for sufficiently large x. Also, from (28), we may assume that for any fixed sufficiently small $\varepsilon > 0$ and sufficiently
large x, $\log q < (2 + \varepsilon) \log P_0$ and so

In this case we can use (20) to approximate


< Mo 1 - 


2 + 6 


< 


3uq 


(36) 


e(uQ) 

^’(Uq) 


^(uq) 

^{uq - 5) 

^(uq) 


e(Mo) + 0(5e(Mo)) 


(uq - 5)({uq - (5) - Mq + d + 1 
WoC(^io) - Mo + 1 

,5(g(Mo) + 0(l)) 

UoC(uo) - Mo + 1 


1 - 


+ 0 


1 


UqC{uq) 


= 1 -+ 0 

Mo 


wo?(wo) 


If we now use the somewhat more precise approximation

using (20). Using (36) we see that when $\delta > \sqrt{u_0}$ this expression is greater than
sufficiently large x. If $\delta \le \sqrt{u_0}$, then we can rewrite ([551)

which is greater than 1 for sufficiently large x since $\delta > 2$. Since this term (38) can now be ignored
in inequality (37) we are left with the same inequality (34) as in the first case, and essentially the
same analysis shows that the ratio is again greater than 1. This proves equation (29).

In order to prove the theorem, we will now need to use the more precise approximation (14). In
particular we have, using that $u = \frac{\log x}{\log p} - 1$ and that p is in the interval described in (28),

In the following we will optimize the value of c as a function of x, however, from (32), we can
assume without loss of generality that |c| < 2. In particular, we would like to choose c so as to

and so, using Lemma l+Tl and the fact that $\nu(x) = \xi(u_0)$,

Also, we see that the final term of (42) can be ignored since


1 + 


1-7 


Y0(xyioglr+ci/(3;) 


i{Us) 


1-7 


1 + 


1-7 


yd+ryTo^ 


^^( 110 ) 


= 1 + 


= 1 + 


= 1 + 


yjy{x) log x-\-cv{x) 


C{Us) - 


1-7 


y/uix) logx 


c^i^(x)+l\ 1 

/ / 


^{uo) 


(44) 


1 + 


1-7 


yO+yio^ 


i{uo) 


1-7 


y/p(x) logx 


{^(us) - ^(uo)) + O 


log X 


1 + 


1-7 


=^(uo) 


1-7 


logs 


yO+yio^* 

:^'(uo) (Us -uo) + 0 


1 + 


1-7 


y/i'ix) logx 


i{uo) 


— 1 + O 1 —^ 



, u 


(45) 


Using (44) and (45) in the ratio

flf.*) 


we have that 




= exp < —(?v{x) 




1 + 


Mo+1 2uo(i^(2:)-l)y 2uq V (pix)-!) 


+0 








(46) 

and so maximizing this ratio is equivalent to maximizing the polynomial expression in c appearing
in the exponent. After rescaling by dividing out a factor of i^(x) + 2 uo(v(x)-i ) ) expression

is 


-c^ + 


1 + 


(u(x)-iy^ 


\ii^ + 2{u(x)-l) 

which is maximized by some c satisfying 


+ 0 — + 


1 


Uq iy{x)uQ 


1 + 


c = 


{u{x)-iy^ 


V ilfh + 2{u{x)-l) 

1 / u{x)'^ — 2v{x) + 2 

2i'{x) \2i/{x)‘^ — 2>v{xY + 1 


+ 0 


.3/2 


+ 


1 


+ 0 


\/ W 0 ^/v{x)uq^ 

1 \ 


4i/(x) 


1 - 


v{x) — 3 


2v{xY — 3i'(x) + 1 


+ 0 


V^xWo, 

1 


vMx)no^ 


(47) 


(48) 


Using this expression for c in (41), we see that the ratio (42) is maximized when s satisfies the
expression given in (25). □


We can use this result to give an asymptotic for the number of times that a prime which is popular
on [2, x] appears as the largest prime divisor of an integer on that interval, which we denote by
C(x), thus giving the height of the peak of the distribution of P(n) on the interval [2, x]. (Note
that if multiple primes are popular on [2, x], they occur the same number of times on that interval,
so the function C(x) is well defined for all x.) This theorem is Theorem 1.2 in the introduction.


Theorem 4.2. If p is popular on the interval [2, x], then C(x), the count of integers n ∈ [2, x] for
which P(n) = p, is given asymptotically by


C{x) = 


X 


y/‘2'K log X 


exp < 


-2v^zy(x)logx+ J ^-j^ds+ +7+0 

0 


(49) 


Proof. We know from the above theorem that if p is popular on [2, x] then

\[ p = \exp\left(\sqrt{\nu(x) \log x} + \frac{1}{4} + O\left(\frac{1}{\nu(x)}\right)\right). \]


Using dH), 


where 



-p{u) 

p 


(^1 + 0 


/ log(l + n) \ 

V logp J 


u = 


logx 

logp 


1 


log X 

V^zy(x)logx + i + o(^) 



1 

4i/(x) 


+ 0 



- 1 


(50) 


(51) 


(52) 

Now,


(( n )=((, m-l + 0 ^ ' 


v{x) 


v{x) 


= i 


v{x) 
= v{x) + O I 

so, using (9), along with (20) we see that




v{x) 


\Jv{x) log X j 


p{u) = ( 1 + o ( - 


e'(n) j ^e* - 1 j 

—-— exp < 7 - u^iu) + / - as 

2vr Jo s 


= 1 + 0 


1 


1 


^{x) J J y 2ttu 

1/4 


exp < 7 — 


{x)+ [ 

Jo 


— 1 


ds + O 


- 1 


z/(x)v^ x) log x 


exph-AWlogI + Wx) + l + ^ 


Combining this with (50) and (51) we have that


1 /-K^) e" _ 1 / 1 

-ds + O 


iy{x) 


V27rlogx 


exp < 


-2Vp(x)1o6X+ j ‘-^d.,+^+y+0 


where we have also used (|^ to see that logx)^/"^ + O f 


(53) 


(54) 


□ 


Note that, asymptotically, -^—us — 
expression in (1491) is given approximately by 


^ + = y^ + o(^), and so the 


u{xy^ 


x exp I — yj 2 log X (log log x + log log log X — (2 + log 2) + o 
which is the estimate given in [H Theorem 1]. 


(55) 


5. Popular primes 

Having seen that the value of any prime which is popular on the interval [2, x] tends, slowly, to
infinity and takes on prime values, one might expect that every prime number is popular on some
such interval. This turns out not to be the case. We define a popular prime to be a prime number
which is popular on some such interval [2, x].

In what follows we will see that not only are there prime numbers which are not popular, but in
fact there is a positive proportion of primes which are not popular. First however, we use Theorem
4.1 to give a lower bound for their count.

Corollary 5.1. There exists an absolute positive constant C such that the count of the popular
primes up to x, for x > 10, is at least

Proof. Theorem 4.1 implies that there exists an absolute constant C' such that for any popular
prime, p, popular on the interval [2, x'], there exists another popular prime in the interval



^1 + C' 


/ / log logx^ 
I \ log x' 



(56) 


Setting y = p we have logy = -^log x' log log x' + O (1), and so we see that for a suitably large 
choice of C" and any y there is a popular prime in the interval 



l + C" 


Tog logy 
logy 


(57) 


If we restrict to counting popular primes appearing in intervals of the form (57) where

y is greater than then we may assume that 1 + C ”> 1 + C” an- 

/ / log log X \ 

V log a: / 


other constant C". The number of non-overlapping intervals of the form (y, y (1 -|- C" 
between and x is 


5 log X iog“'" X 

log (i + 


log’'" 


and the result follows. 


(58) 


□ 


Before we can prove an upper bound for the distribution of the popular primes, we need first a
version of the Buchstab identity for the function $\Psi(x, y)$ defined earlier.

Lemma 5.2. Let $p_n$ denote the nth prime number. For any k ≥ 1,

\[ \Psi\left(\frac{x}{p_{n+k}}, p_{n+k}\right) = \Psi\left(\frac{x}{p_{n+k}}, p_n\right) + \sum_{i=1}^{k} \Psi\left(\frac{x}{p_{n+k}\,p_{n+i}}, p_{n+i}\right). \tag{59} \]

Proof. The left hand side, $\Psi\left(\frac{x}{p_{n+k}}, p_{n+k}\right)$, counts those integers at most x whose largest prime
factor is $p_{n+k}$. Taking such an integer m, and dividing out a factor of $p_{n+k}$ we obtain an integer,
$m/p_{n+k}$, at most $x/p_{n+k}$ whose largest prime factor is either less than or equal to $p_n$, in which case m is
counted by $\Psi\left(\frac{x}{p_{n+k}}, p_n\right)$, or its largest prime factor is $p_{n+i}$ for some 1 ≤ i ≤ k, in which case m is
counted by $\Psi\left(\frac{x}{p_{n+k}\,p_{n+i}}, p_{n+i}\right)$. □
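The identity of Lemma 5.2 is exact, so it can be confirmed by direct counting. The following sketch (hypothetical names, brute-force counting with a sieve of largest prime factors) evaluates both sides for given x, n and k.

```python
def lpf_table(limit):
    """lpf[m] = largest prime factor of m for 2 <= m <= limit (0 below 2)."""
    lpf = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if lpf[p] == 0:  # p is prime
            for m in range(p, limit + 1, p):
                lpf[m] = p
    return lpf

def psi(x, y, lpf):
    """Psi(x, y) for an integer 0 <= x < len(lpf), counting 1 as y-smooth."""
    return sum(1 for m in range(1, x + 1) if m == 1 or lpf[m] <= y)

def buchstab_sides(x, primes, n, k, lpf):
    """Both sides of (59); primes[n - 1] is p_n, the nth prime."""
    p_n, p_nk = primes[n - 1], primes[n + k - 1]
    lhs = psi(x // p_nk, p_nk, lpf)
    rhs = psi(x // p_nk, p_n, lpf)
    for i in range(1, k + 1):
        p_ni = primes[n + i - 1]
        rhs += psi(x // (p_nk * p_ni), p_ni, lpf)
    return lhs, rhs

if __name__ == "__main__":
    lpf = lpf_table(10 ** 4)
    primes = [p for p in range(2, 10 ** 4 + 1) if lpf[p] == p]
    print(buchstab_sides(100, primes, 2, 2, lpf))  # p_n = 3, p_{n+k} = 7
```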

We can use this lemma to show that the average prime spacing between popular primes cannot 
be too small. 


Theorem 5.3. If the primes $p_n$ and $p_{n+k}$ are any two popular primes satisfying


then the average prime gap between these primes must satisfy 

Pn+k - Pn y / loglogyn TT y(2 - a) log Pn 

k “V V logPn // 2-0 

where a = . 

logPn 


(60) 


(61) 

Proof. Suppose that $p_n$ and $p_{n+k}$, k > 0, are any two popular primes satisfying (60) and let a =
. Because both $p_n$ and $p_{n+k}$ are popular, there exist integers $x_n$ and $x_{n+k}$ such that $p_n$
is popular on the interval $[2, x_n]$, and likewise $p_{n+k}$ is popular on $[2, x_{n+k}]$.


Now, as X increases, the function 'I' is nondecreasing, in fact, as x increases through the 

integers, the difference 'I' i® either 0 or 1. So, in the case that Xn+k > we 

have that 


>1' I —,P„+f) < W f-.pd < (^,P„) < <l(^.Pn+t 

yPn+k ) \Pn ) \ Pn ) \Pn+k 


(62) 


Thus, we see that as x increases from to Xn+k^ there must be an intermediate integer x' between 
Xn+k and Xn for which 


\Pn+k ) \Pn 


(63) 


Note that it need not necessarily be the case that $x_{n+k} > x_n$, however the case that $x_{n+k} < x_n$ is
essentially identical and we again find an integer x' between these values satisfying (63).

By Theorem 4.1 we know that both


logPn = y/u{Xn)logXn + \+ o{ / ' ) 

4 \^{Xn)J 


and 


logp„+fc = yju{Xn+k)^OgXn+k + \ + O { -r ) • 

4 \v{Xn+k)J 

Since logpn+k — logPn = O ^ and x' lies between and Xn+k we must have that 

logPn = V ^{x') log x' + i + O • (64) 

Set uo = —1. Using Eauation[63l Lemma[52]and the approximation T(x, y) = (l+O (4^) xp{u) 

we can write 


\Pn J \Pn+k 


X 


Pn = 


2 = 1 


X 


Pn-\-kPn-\-i 


?Pn+2 


= 1 + 0 


Mo 


^ x' ^ / log X- log Pn+k - Pn+i \ 
^Pn+iPn+k^\ log Pn+i / 


Using Lemma EH 


log x' - log Pn+k - log Pn+i \ ^ 2logPn + 0 ^ 


log Pn+i 


= P 


logj.„(l + o(j5i^)) 

= p f (mo - 1) f 1 + O 


1 


log PnJJ 

= ,i + 0|||^))pK-i) 


= 1 + 0 


logPn 


p(mo - 1 ). 


( 66 ) 

Since


we have that 


k k 

yJ- = y _L 


logp„ 


= ±fl + of^ 

Pn V VOg Pr. 


(67) 


^ I —,Pn 

<Pn J \Pn+k 


,pn] = (l + O^ ^ 


x'k 


^0 / / PnPnH-/c 


-p(no - 1). 


( 68 ) 


On the other hand, using Hildebrand’s upper bound for the count of smooth numbers in short 
intervals with z = ^ — we have that 


PnPn+k 


Pn+k 


vhl -,pj - 

\Pn J \Pn+k 


<1 + 0 


1 


Pn-\-k 




+ Z,Pn - 


Pn-\-k 


-^Pn 


Pn + k 


^,Pn)pnlog(|^) 




= (l + 0( - 
.Wo 


= (l + 0( - 
.Wo 


= 1 + 0 


Wo 


Using (I^UIl to see that 


l+p( ‘°+:‘r" )(-°i^^'--°g^ + o(i+r)) 

x'p-n „ f ^Ogx'+logPn-logPn+k-^Ogz\ ,_ 

.P[ logp„ yogpn 


^Pn+k ^ 


zp(wo) (^(2-a)logpn + 0(^j^)) 

(2 - a)p (uq) x'{pn+k - Pn) 
p{2-a) PnPn+k 


(69) 




= e 


log x' 


Y^wOrOloilU + 0(1) 


- 1 


I log x' 
u{x') 


-1 + 0 


v{x') 


and, from the functional equation (27) for $\nu(x)$, that

\[ \nu(x') = \log\left(1 + \sqrt{\nu(x') \log x'} - \nu(x')\right) = \log\left(\log p_n - \nu(x') + O(1)\right) = \log\log p_n + o(1) \]


(70) 


(71) 

we can conclude, by combining (68) and (69), and using (17) that


Pn+k - Pn ^ / P(2 - a) 

k “ y 2 — a 
p{2 - a) 


+ 0 


+ 0 


2 — a 

p{2-a)\ C{uo)logx' 


P{u - 1 ) 

p{uo) 

uo^{uo) 


= 1 + 0 


9 I / / /M -7 + 0(C(lfo)) 

z — a / logx' 

floglog Pn\\ p{2- a) log Pn 


V ^Og Pr, 


2 — a 


( 72 ) 

□ 


As a corollary, we see that for any sufficiently large pair of twin primes, or consecutive primes
with any fixed gap, the smaller of the pair will never be a popular prime. In fact, approximating
$\rho(2)/2 = 0.153\ldots$ we have the following stronger result, which is Theorem 1.3 in the introduction.

Corollary 5.4. Given any two sufficiently large consecutive primes, p < q, if the gap between
them, q − p, is less than 0.153 log p, then p is not a popular prime.
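The constant can be checked from the closed form $\rho(u) = 1 - \log u$ on [1, 2], which follows from the delay equation for $\rho$. The short sketch below (hypothetical names) evaluates $\rho(2)/2$ and tests the gap condition of Corollary 5.4. Because "sufficiently large" is not made explicit, a positive answer from `gap_is_too_small` does not by itself show that a particular prime is unpopular.

```python
import math

RHO_2 = 1 - math.log(2)   # rho(u) = 1 - log u for 1 <= u <= 2
GAP_CONSTANT = RHO_2 / 2  # approximately 0.1534

def gap_is_too_small(p, q, constant=0.153):
    """True when q - p < constant * log p (the caller supplies consecutive primes p < q)."""
    return (q - p) < constant * math.log(p)

def smallest_p_for_gap(gap, constant=0.153):
    """Real number beyond which a fixed gap is below constant * log p."""
    return math.exp(gap / constant)

if __name__ == "__main__":
    print(GAP_CONSTANT)
    print(smallest_p_for_gap(2))  # the size at which twin primes start to meet the condition
```

For twin primes the condition only becomes non-vacuous once p is of order $e^{2/0.153}$, a few hundred thousand.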

Goldston, Pintz and Yildirim [6] have shown that for any fixed $\eta > 0$, there is a positive proportion
of prime numbers, p, which are followed by a gap less than $\eta \log p$, which means we can conclude
the following, Corollary 1.4 from the introduction, as well.

Corollary 5.5. A positive proportion of the prime numbers are not popular. 

Note that if we assume that the smooth numbers are regularly distributed in all of the short
intervals that we are concerned with in the proof of Theorem 5.3 we can do much better. Assuming,
as is widely conjectured, that

\[ \Psi(x + z, y) - \Psi(x, y) \sim \frac{z}{x} \Psi(x, y) \tag{73} \]

for $y \sim \exp\left(\sqrt{\nu(x) \log x} + \frac{1}{4}\right)$ and $z > x/y^2$, we could show, by the method of Theorem 5.3 that
the average gap between any two popular primes p and q, p < q, must be asymptotically equal to
log q, and thus that the popular primes have relative density 0 among the primes.


6. Computations and the Convex Primes 

Compiling a list of the popular primes is computationally difficult, as it requires counting all
of the largest prime divisors of integers up to relatively large values of x compared to the
popular primes themselves. The first few popular primes (popular on some interval [2, x] for some
x < 70,000,000,000,000) and the integer x for which they were first popular on the interval [2, x] are
given in the table below. Note that thus far no prime has been a popular prime without being the
uniquely popular prime on some such interval. Further, the table gives the count of the number of
times the prime occurs as the largest prime divisor of an integer in the interval [2, x].

Primes popular on some interval [2, x] for x < 10^{14}

Popular   First popular     First uniquely    Last popular      C(x)          C(z)
prime     on [2, x]         popular           on [2, z]

2         2                 2                 17                1             4
3         3                 12                119               1             14
5         45                80                279               8             25
7         70                196               1858              10            77
13        1456              1638              5471              67            151
19        4845              4864              29301             140           428
23        20332             22425             53474             344           616
31        46345             46500             117303            563           1005
43        106812            109779            220523            947           1517
47        153032            158625            611374            1197          2902
73        592760            603564            2642391           2846          7664
83        2484190           2552416           2672025           7357          7722
109       2620033           2620142           2952463           7621          8284
113       2623860           2627250           41192601          7629          48380
199       41163150          41163747          237611044         48357         161644
283       237321819         237398795         1967277194        161507        698074
467       1966462280        1966466950        13692930957       697875        2761234
661       13690728506       13690729828       64358549949       2760913       8357693
773       64322151699       64322158656       79880100420       8354317       9758410
887       79838726306       79838739611       220369251374      9754751       20285553
1109      220355977754      220355987735      232880841877      20284680      21123128
1129      232268764689      232268774850      618765808209      21082412      43031555
1327      618745965579      618745972214      1882062587041     43030537      96835113
1627      1882062393429     1882062476406     9607847299025     96835105      318539488
2143      9607711921430     9607713772982     19364476224949    318536223     534261087
2399      19364051434020    19364051829855    26396066576762    534252383     672081919
2477      26393150922356    26393150937218    37636861534247    672026918     873949289
2803      37636607775855    37636607806688    84128837898779    873944930     1588958920
2861      84128837864448    84128837898780    85992223800357    1588958920    1612740571
2971      85992223734996    85992223800358    89487767416445    1612740571    1656313907
3023      89487767413423    89487767416446    90749798232275    1656313907    1672851087
3041      90749798153210    90749798232276    91157523869191    1672851087    1678444884
3049      91015395545226    91015395548275    91473520711546    1676495503    1682728352
3089      91473520705369    91473520711547    92913565436551    1682728352    1699108828
3137      93871134565472    93871134606253    94131107722837    1708870682    1712113344
3373      94131107675616    94131107722838    > 10^{14}         1712113344    > 1791544685


Note that the ranges of popularity for 73, 83, 109 and 113 all overlap, and in fact all four are 
popular on the interval [2,2626355], each occurring 7634 times. 
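
The first rows of the table can be checked by brute force. The following is a minimal sketch, not the method used to produce the table: it assumes a simple sieve that records the largest prime divisor of every n up to x, and the function names `largest_prime_factors` and `popular_primes` are chosen here for illustration only.

```python
from collections import Counter


def largest_prime_factors(x):
    """Return a list lpf with lpf[n] = P(n), the largest prime divisor of n, for 2 <= n <= x."""
    lpf = [0] * (x + 1)
    for p in range(2, x + 1):
        if lpf[p] == 0:  # no smaller prime divides p, so p is prime
            for m in range(p, x + 1, p):
                lpf[m] = p  # larger primes overwrite smaller ones
    return lpf


def popular_primes(x):
    """Return (list of primes popular on [2, x], number of times each occurs as P(n))."""
    counts = Counter(largest_prime_factors(x)[2:])
    best = max(counts.values())
    return sorted(p for p, c in counts.items() if c == best), best


if __name__ == "__main__":
    for x in (12, 17, 45):
        print(x, popular_primes(x))
```

For x = 12 this reports 3 as the unique popular prime, for x = 17 it reports a tie between 2 and 3 with four occurrences each, and for x = 45 a tie between 3 and 5 with eight occurrences each, in agreement with the corresponding table entries. The sieve uses memory proportional to x, so it is only suitable for the small rows of the table.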

Thus far, the data for the popular primes appear to be related to a subset of the prime numbers 
studied by Pomerance [13] and Tutaj [19] and also discussed in Guy’s book of unsolved problems in 
number theory [7, Problem A14]. This set, the “convex primes,” is the set of those prime 
numbers, $p_n$, which form the vertices of the boundary of the convex hull of the points $(n, p_n)$ in the 
plane. Pomerance uses this set of primes to show that there are infinitely many primes $p_n$ which 
satisfy the inequality 

\[ 2p_n < p_{n-i} + p_{n+i} \quad \text{for all positive } i < n. \]
Using the best known error term for the prime number theorem, Pomerance claims that there are 
at least exp(c(log convex primes up to x for any e > 0 and some constant c > 0. Assuming 

the Riemann hypothesis gives at least j log^^^ x convex primes. 

The popular primes computed above form a superset of the convex primes in this range: all of the 
convex primes less than 3000 are also popular. Furthermore, all of those primes, $p_n$, where the point 
$(n, p_n)$ lies on the boundary of the convex hull but is not a vertex point of it (namely 5, 13, 23, 31 and 
43) are popular as well. The popular primes 83, 109, 773, 1109, 2143, 2399, 2477, 2861, 2971, 3023, 
3041, 3049, 3089, 3137 and 3373 correspond to points on the interior of the convex hull, however. 
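
The distinction between vertices, non-vertex boundary points and interior points can be illustrated computationally. The sketch below is an assumption-laden illustration rather than the computation behind the statements above: it takes the lower boundary of the convex hull of the first few points $(n, p_n)$ using a standard monotone-chain scan, and the helper names (`primes_up_to`, `lower_hull`) are hypothetical. Because only finitely many points are used, hull points near the end of the range are not reliable; only primes well below the cutoff should be read off.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]


def cross(o, a, b):
    """Cross product of the vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points, keep_collinear=False):
    """Lower convex hull of points sorted by x; optionally keep collinear boundary points."""
    hull = []
    for pt in points:
        while len(hull) >= 2:
            turn = cross(hull[-2], hull[-1], pt)
            if turn < 0 or (turn == 0 and not keep_collinear):
                hull.pop()  # hull[-1] lies above (or on) the new lower edge
            else:
                break
        hull.append(pt)
    return hull


primes = primes_up_to(1000)
points = list(enumerate(primes, start=1))  # the points (n, p_n)
vertices = [p for _, p in lower_hull(points)]
boundary = [p for _, p in lower_hull(points, keep_collinear=True)]

if __name__ == "__main__":
    print([p for p in vertices if p <= 113])
    print([p for p in boundary if p <= 47 and p not in vertices])
```

With the cutoff 1000 used here, the vertices up to 113 are 2, 3, 7, 19, 47, 73 and 113, and the boundary points up to 47 that are not vertices are exactly 5, 13, 23, 31 and 43, matching the list given above.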

Both convex primes and popular primes are more likely to be found after a run of densely 
packed primes, and prior to a larger than average gap between primes, which partially justifies 
the connection. If one assumes that the convex primes continue to be a subset of the popular 
primes, then we would expect the count of the popular primes up to x to be at least log^^^ x, 
substantially better than what we are able to prove in Corollary 5.1. In a forthcoming paper we 
will further discuss the convex primes, including a significantly improved upper bound for their 
count. 


7. Optimization of factoring algorithms: making squares 


As mentioned in the introduction, the analysis done here is closely related to a key step in the 
analysis of the running time of a variety of factoring algorithms. In particular, one wishes to choose 
an optimal smoothness bound y so as to minimize the number of random integers that must be 
chosen from the interval [1,x] before the product of some subset of the integers chosen at random 
is a square. When some subset of the integers has this property we say that the set has a square 
dependence. Since the probability that an integer chosen at random from the interval [1,x] is y-smooth 
is $\Psi(x,y)/x$, and any set of $\pi(y)+1$ y-smooth integers contains a square dependence, it is advantageous 
to pick a value of y which minimizes the expression $x\pi(y)/\Psi(x,y)$, or equivalently maximizes 

\[ \frac{\Psi(x,y)}{\pi(y)} = \left(1 + O\left(\frac{1}{\log y}\right)\right)\frac{\Psi(x,y)\log y}{y}. \tag{74} \]
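
For small x the two quantities discussed in this section, $\Psi(x,p)/p$ and $\Psi(x,p)/\pi(p)$, can be tabulated directly. The following is a minimal brute-force sketch under the assumption that $\Psi(x,y)$ counts the integers $1 \le n \le x$ with no prime factor exceeding y (so that n = 1 is always counted); the function names are illustrative only, and exact rational arithmetic is used to avoid spurious ties.

```python
from fractions import Fraction


def largest_prime_factors(x):
    """Return a list lpf with lpf[n] = largest prime divisor of n, for 2 <= n <= x."""
    lpf = [0] * (x + 1)
    for p in range(2, x + 1):
        if lpf[p] == 0:  # p is prime
            for m in range(p, x + 1, p):
                lpf[m] = p
    return lpf


def smoothness_ratios(x):
    """Return rows (p, Psi(x,p), Psi(x,p)/p, Psi(x,p)/pi(p)) for every prime p <= x."""
    lpf = largest_prime_factors(x)
    by_prime = {}
    for n in range(2, x + 1):
        by_prime[lpf[n]] = by_prime.get(lpf[n], 0) + 1
    rows = []
    psi = 1  # n = 1 is y-smooth for every y
    for k, p in enumerate(sorted(by_prime), start=1):  # k equals pi(p)
        psi += by_prime[p]
        rows.append((p, psi, Fraction(psi, p), Fraction(psi, k)))
    return rows


def maximizers(x):
    """Return (prime maximizing Psi(x,p)/p, prime maximizing Psi(x,p)/pi(p))."""
    rows = smoothness_ratios(x)
    return max(rows, key=lambda r: r[2])[0], max(rows, key=lambda r: r[3])[0]


if __name__ == "__main__":
    print(maximizers(100))
```

For x = 100 there are 34 integers that are 5-smooth and 46 that are 7-smooth, so $\Psi(x,p)/p$ is largest at p = 5 (34/5) while $\Psi(x,p)/\pi(p)$ is largest at p = 7 (46/4). Such small values of x are far from the asymptotic regime treated in this section and serve only to make the definitions concrete.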

The analysis of the maximum value of $\Psi(x,y)/y$ is highly similar to the analysis of the peak value 
of $\Psi(x/p, p)$ performed in Section 4. In fact, maximizing $\Psi(x,y)/y$ requires maximizing the same 
expression (40) as in the proof of Theorem 4.1, with the modification that now $u = \frac{\log x}{\log y}$, rather 
than that value shifted by one. One thus finds that after suitably modifying the implicitly defined 
function $\nu(x)$ used in the proof, replacing it instead with the function $\omega(x)$, which 
satisfies the functional equation 

\[ e^{\omega(x)} = 1 + \sqrt{\omega(x)\log x}, \tag{75} \]

and, like $\nu(x)$, is given approximately by 

\[ \omega(x) = \tfrac{1}{2}\log\log x + \tfrac{1}{2}\log\log\log x - \tfrac{1}{2}\log 2 + o(1) \tag{76} \]

as $x \to \infty$, the exact same analysis goes through and one obtains the following. 

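Theorem 7.1. If, for a given value of x, the prime p maximizes the expression $\Psi(x,p)/p$, then 

\[ p = \exp\left\{ \sqrt{\omega(x)\log x} + \frac{1}{4}\left(1 - \frac{\omega(x) - 3}{2\omega(x)^2 - 3\omega(x) + 1}\right) \right\} \left(1 + O\left(\left(\frac{\log\log x}{\log x}\right)^{1/4}\right)\right). \tag{77} \]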
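Comparing the functions $\nu(x)$ and $\omega(x)$, we find that 

\begin{align*}
\nu(x) - \omega(x) &= \log\left(\sqrt{\nu(x)\log x} - \nu(x) + 1\right) - \log\left(\sqrt{\omega(x)\log x} + 1\right) \\
&= \tfrac{1}{2}\log\nu(x) - \tfrac{1}{2}\log\omega(x) + \log\left(1 - \sqrt{\frac{\nu(x)}{\log x}} + O\left(\frac{1}{\sqrt{\log x\log\log x}}\right)\right) \\
&= O\left(\frac{\nu(x) - \omega(x)}{\omega(x)}\right) - \sqrt{\frac{\nu(x)}{\log x}} + O\left(\frac{1}{\sqrt{\log x\log\log x}}\right) \\
&= -\sqrt{\frac{\nu(x)}{\log x}} + O\left(\frac{1}{\sqrt{\log x\log\log x}}\right). \tag{78}
\end{align*}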


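We can use this to restate Theorem 7.1 in terms of the function $\nu(x)$ for comparison to Theorem 4.1. 

Corollary 7.2. If, for a given value of x, the prime p maximizes the expression $\Psi(x,p)/p$, then 

\[ p = \exp\left\{ \sqrt{\nu(x)\log x} + \frac{3}{4} + O\left(\frac{1}{\log\log x}\right) \right\}. \tag{79} \]

Proof. Using (78), we see that 

\begin{align*}
\sqrt{\omega(x)\log x} &= \sqrt{\left(\nu(x) + \sqrt{\frac{\nu(x)}{\log x}} + O\left(\frac{1}{\sqrt{\log x\log\log x}}\right)\right)\log x} \\
&= \sqrt{\nu(x)\log x + \sqrt{\nu(x)\log x} + O\left(\sqrt{\frac{\log x}{\log\log x}}\right)} \\
&= \sqrt{\nu(x)\log x} + \frac{1}{2} + O\left(\frac{1}{\log\log x}\right). \tag{80}
\end{align*}

Using this approximation in (77), the result follows. □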
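
The relations between $\omega(x)$ and $\nu(x)$ can be checked numerically. The sketch below assumes the two functional equations $e^{\omega} = 1 + \sqrt{\omega\log x}$ and $e^{\nu} = 1 + \sqrt{\nu\log x} - \nu$ (the latter read off from the first line of the comparison of $\nu$ and $\omega$) and solves each for its positive root by bisection. The helper names are hypothetical, and the input is $\log x$ rather than x so that very large x can be used.

```python
import math


def solve_increasing(f, lo=1e-9, hi=1.0, tol=1e-13):
    """Bisection for the sign change of f on (lo, infinity), assuming f(lo) < 0."""
    while f(hi) < 0:
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def omega(log_x):
    """Positive solution w of exp(w) = 1 + sqrt(w * log x)."""
    return solve_increasing(lambda w: math.exp(w) - 1 - math.sqrt(w * log_x))


def nu(log_x):
    """Positive solution v of exp(v) = 1 + sqrt(v * log x) - v."""
    return solve_increasing(lambda v: math.exp(v) - 1 + v - math.sqrt(v * log_x))


def omega_asymptotic(log_x):
    """Main terms of the asymptotic expansion of omega(x), without the o(1) term."""
    return 0.5 * math.log(log_x) + 0.5 * math.log(math.log(log_x)) - 0.5 * math.log(2)


if __name__ == "__main__":
    for digits in (100, 1000, 10000):
        L = digits * math.log(10)  # log x for x = 10**digits
        w, v = omega(L), nu(L)
        print(digits, w, omega_asymptotic(L), v - w, -math.sqrt(v / L),
              math.sqrt(w * L) - math.sqrt(v * L))
```

The last printed column, $\sqrt{\omega(x)\log x} - \sqrt{\nu(x)\log x}$, should approach 1/2, but the error terms decay only like $1/\log\log x$, so the agreement is rough even for x with thousands of digits.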

The method of proof can also be adapted to maximize the function $\Psi(x,y)/\pi(y)$, which is slightly 
more relevant to the optimization of these factoring algorithms. Using the approximation 
$\pi(y) = \frac{y}{\log y}\left(1 + \frac{1}{\log y} + O\left(\frac{1}{\log^2 y}\right)\right)$ again, the analysis is nearly identical to that of Theorem 
4.1 with the function $\omega(x)$ used in place of $\nu(x)$. However, instead of equation (12), we find that 
we are maximizing the ratio 


T (x,s)7r(Po) 

T {x,Po)tt{s) 


(i + Iga-.)) (i - 4?) 

Po (l + (l “ 13^) 



where $u_0$ and $u_s$ have been suitably modified. 
As before, the term 



can be absorbed into the error term, however the additional ratio of 
an additional in the exponent of (j46p . 


1 + — 

UO 


(81) 


introduces 
As a result, when we maximize c, we find that it now occurs for some c satisfying 


c = 




1 - 


3cij(x) — 5 


+ 0 


1 


^uj{x)uo 


(82) 


6a;(x)^ — 9a;(x) + 3^ 

Thus we can conclude the following asymptotic, useful in determining the optimal smoothness 
bound for use in integer factorization. 

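Theorem 7.3. If, for a given value of x, the prime p maximizes the expression $\Psi(x,p)/\pi(p)$, then 

\begin{align*}
p &= \exp\left\{ \sqrt{\omega(x)\log x} + \frac{3}{4}\left(1 - \frac{3\omega(x) - 5}{6\omega(x)^2 - 9\omega(x) + 3}\right) \right\} \left(1 + O\left(\left(\frac{\log\log x}{\log x}\right)^{1/4}\right)\right) \\
&= \exp\left\{ \sqrt{\nu(x)\log x} + \frac{5}{4} + O\left(\frac{1}{\log\log x}\right) \right\}. \tag{83}
\end{align*}

Note that (83) implies that in the limit as $x \to \infty$, the ratio of the prime, p, which maximizes $\Psi(x,p)/\pi(p)$ 
to a prime popular on [2, x] tends to e. 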

Having estimated the value of y which maximizes $\Psi(x,y)/\pi(y)$ relatively precisely, we can likewise give 
an estimate for the maximum value of this function. Note that the maximum value of this function 
is what plays a key role in the analysis of factoring algorithms. Denote by h(x) this maximum 
value of $\Psi(x,y)/\pi(y)$ taken over all y < x. Croot, Granville, Pemantle and Tetali showed [3] that if one 
chooses integers at random between 1 and x until the sequence contains a square dependence, then 
the expected stopping time lies in the interval $\left(\left(\frac{\pi e^{-\gamma}}{4} + o(1)\right)\frac{x}{h(x)}, \left(e^{-\gamma} + o(1)\right)\frac{x}{h(x)}\right)$, and furthermore 
that as $x \to \infty$, the stopping time lies almost surely in this interval. The only estimate that 
they give for h(x), however, is that $h(x) = x\exp\left\{-\sqrt{(2 + o(1))\log x\log\log x}\right\}$. (In their notation, 
$J_0(x) = \frac{x}{h(x)}$.) We give here an asymptotic expression for the value of this function, proving 
Theorem 1.5 in the introduction. 
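
The random process described above can be simulated for small x. The following is a minimal sketch, assuming trial-division factoring and incremental Gaussian elimination over GF(2) on the parity vectors of the prime exponents; the class and function names are hypothetical and not taken from [3]. A newly drawn integer creates a square dependence exactly when its parity vector reduces to zero against the vectors already stored.

```python
import random


class SquareDependenceDetector:
    """Tracks exponent-parity vectors over GF(2) and reports the first square dependence."""

    def __init__(self):
        self.prime_index = {}  # prime -> bit position in the parity vector
        self.basis = {}        # leading bit -> reduced basis vector

    def _parity_vector(self, n):
        """Bitmask with a 1 for every prime dividing n to an odd power (trial division)."""
        vec = 0
        d = 2
        while d * d <= n:
            if n % d == 0:
                odd = False
                while n % d == 0:
                    n //= d
                    odd = not odd
                if odd:
                    vec ^= 1 << self.prime_index.setdefault(d, len(self.prime_index))
            d += 1
        if n > 1:  # remaining prime factor, occurring exactly once
            vec ^= 1 << self.prime_index.setdefault(n, len(self.prime_index))
        return vec

    def add(self, n):
        """Add n; return True if some subset of the integers added so far has square product."""
        vec = self._parity_vector(n)
        while vec:
            lead = vec.bit_length() - 1
            if lead not in self.basis:
                self.basis[lead] = vec
                return False
            vec ^= self.basis[lead]
        return True


def stopping_time(x, rng):
    """Number of uniform draws from [1, x] needed until a square dependence appears."""
    detector = SquareDependenceDetector()
    draws = 0
    while True:
        draws += 1
        if detector.add(rng.randint(1, x)):
            return draws


if __name__ == "__main__":
    rng = random.Random(2024)
    x = 10 ** 6
    trials = [stopping_time(x, rng) for _ in range(50)]
    print(sum(trials) / len(trials))
```

Averaging `stopping_time` over many trials gives an empirical stopping time that can be set against $x/h(x)$ computed by brute force for the same x. For the small x reachable this way the $o(1)$ terms are not negligible, so only rough agreement with the stated interval should be expected.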

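Theorem 7.4. For a given value of x, the value of h(x), the maximum value of $\Psi(x,y)/\pi(y)$ for y < x, 
is given asymptotically by 

\[ h(x) = \frac{x}{\sqrt{2\pi\log x}} \exp\left\{ -2\sqrt{\omega(x)\log x} + \int_0^{\omega(x)} \frac{e^s - 1}{s}\,ds + \frac{3\omega(x)}{2} + \gamma + O\left(\frac{1}{\log\log x}\right) \right\} \tag{84} \]

or, equivalently, the same expression with $\nu(x)$ in place of $\omega(x)$, 

\begin{align*}
h(x) &= \frac{x}{\sqrt{2\pi\log x}} \exp\left\{ -2\sqrt{\nu(x)\log x} + \int_0^{\nu(x)} \frac{e^s - 1}{s}\,ds + \frac{3\nu(x)}{2} + \gamma + O\left(\frac{1}{\log\log x}\right) \right\} \\
&= C(x)\left(1 + O\left(\frac{1}{\log\log x}\right)\right), \tag{85}
\end{align*}

where C(x), defined before Corollary 4.2, is the number of times a prime, popular on [2,x], appears 
as the largest prime divisor of an integer on that interval. 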

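Proof. Because $\pi(y) = \frac{y}{\log y}\left(1 + O\left(\frac{1}{\log y}\right)\right)$, the proof is essentially identical to that of Corollary 
4.2 (again using $\omega(x)$ in place of $\nu(x)$) with the exception that in (52) we now have $u = \frac{\log x}{\log y}$, 
which causes us to lose a factor of $\omega(x)$ in the exponent of the expression (54), and that the final 
expression is multiplied by a factor of 

\[ \log y = \sqrt{\omega(x)\log x}\left(1 + O\left(\frac{1}{\sqrt{\omega(x)\log x}}\right)\right) = e^{\omega(x)}\left(1 + O\left(\frac{1}{\sqrt{\omega(x)\log x}}\right)\right), \]

which then restores that factor of $\omega(x)$ to the exponent. 

Using this, we obtain (85) by using (80) (which decreases the exponent by 1 when using $\nu(x)$) 
along with the observation that 

\begin{align*}
\int_{\nu(x)}^{\omega(x)} \frac{e^s - 1}{s}\,ds &= (\omega(x) - \nu(x))\frac{e^{\nu(x)} - 1}{\nu(x)} + O\left((\omega(x) - \nu(x))\left(\frac{e^{\omega(x)} - 1}{\omega(x)} - \frac{e^{\nu(x)} - 1}{\nu(x)}\right)\right) \\
&= \left(\sqrt{\frac{\nu(x)}{\log x}} + O\left(\frac{1}{\sqrt{\log x\log\log x}}\right)\right)\left(\sqrt{\frac{\log x}{\nu(x)}} - 1\right) + O\left(\sqrt{\frac{\nu(x)}{\log x}}\left(\sqrt{\frac{\log x}{\omega(x)}} - \sqrt{\frac{\log x}{\nu(x)}} + 1\right)\right) \\
&= 1 + O\left(\frac{1}{\nu(x)}\right),
\end{align*}

which, in turn, increases the exponent by 1. □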